23 research outputs found

    Linguistic Threat Assessment: Understanding Targeted Violence through Computational Linguistics

    Language alluding to possible violence is widespread online, and security professionals are increasingly faced with the issue of understanding and mitigating this phenomenon. The volume of extremist and violent online data presents a workload that is unmanageable for traditional, manual threat assessment. Computational linguistics may be of particular relevance to understanding threats of grievance-fuelled targeted violence on a large scale. This thesis seeks to advance knowledge on the possibilities and pitfalls of threat assessment through automated linguistic analysis. Based on in-depth interviews with expert threat assessment practitioners, three areas of language are identified which can be leveraged for automation of threat assessment, namely, linguistic content, style, and trajectories. Implementations of each area are demonstrated in three subsequent quantitative chapters. First, linguistic content is utilised to develop the Grievance Dictionary, a psycholinguistic dictionary aimed at measuring concepts related to grievance-fuelled violence in text. Thereafter, linguistic content is supplemented with measures of linguistic style in order to examine the feasibility of author profiling (determining gender, age, and personality) in abusive texts. Lastly, linguistic trajectories are measured over time in order to assess the effect of an external event on an extremist movement. Collectively, the chapters in this thesis demonstrate that linguistic automation of threat assessment is indeed possible. The concluding chapter describes the limitations of the proposed approaches and illustrates where future potential lies to improve automated linguistic threat assessment. Ideally, developers of computational implementations for threat assessment strive for explainability and transparency. 
Furthermore, it is argued that computational linguistics holds particular promise for large-scale measurement of grievance-fuelled language, but is perhaps less suited to prediction of actual violent behaviour. Lastly, researchers and practitioners involved in threat assessment are urged to collaboratively and critically evaluate novel computational tools which may emerge in the future.

    Measuring Emotions in the COVID-19 Real World Worry Dataset

    The COVID-19 pandemic is having a dramatic impact on societies and economies around the world. With various measures of lockdowns and social distancing in place, it becomes important to understand emotional responses on a large scale. In this paper, we present the first ground truth dataset of emotional responses to COVID-19. We asked participants to indicate their emotions and express these in text. This resulted in the Real World Worry Dataset of 5,000 texts (2,500 short + 2,500 long texts). Our analyses suggest that emotional responses correlated with linguistic measures. Topic modeling further revealed that people in the UK worry about their family and the economic situation. Tweet-sized texts functioned as a call for solidarity, while longer texts shed light on worries and concerns. Using predictive modeling approaches, we were able to approximate the emotional responses of participants from text within 14% of their actual value. We encourage others to use the dataset and improve how we can use automated methods to learn about emotional responses and worries about an urgent problem. Comment: Accepted to ACL 2020 COVID-19 workshop.

    The temporal evolution of a far-right forum

    The increased threat of right-wing extremist violence necessitates a better understanding of online extremism. Radical message boards, small-scale social media platforms, and other internet fringes have been reported to fuel hatred. The current paper examines data from the right-wing forum Stormfront between 2001 and 2015. We specifically aim to understand the development of user activity and the use of extremist language. Various time-series models depict posting frequency and the prevalence and intensity of extremist language. Individual user analyses examine whether some super users dominate the forum. The results suggest that structural break models capture the forum evolution better than stationary or linear change models. We observed an increase of forum engagement followed by a decrease towards the end of the time range. However, the proportion of extremist language on the forum increased in a step-wise manner until the early summer of 2011, followed by a decrease. This temporal development suggests that forum rhetoric did not necessarily become more extreme over time. Individual user analysis revealed that super forum users accounted for the vast majority of posts and of extremist language. These users differed from normal users in their evolution of forum engagement.

    The Grievance Dictionary: Understanding Threatening Language Use

    This paper introduces the Grievance Dictionary, a psycholinguistic dictionary which can be used to automatically understand language use in the context of grievance-fuelled violence threat assessment. We describe the development of the dictionary, which was informed by suggestions from experienced threat assessment practitioners. These suggestions and subsequent human and computational word list generation resulted in a dictionary of 20,502 words annotated by 2,318 participants. The dictionary was validated by applying it to texts written by violent and non-violent individuals, showing strong evidence for a difference between populations in several dictionary categories. Further classification tasks showed promising performance, but future improvements are still needed. Finally, we provide instructions and suggestions for the use of the Grievance Dictionary by security professionals and (violence) researchers. Comment: pre-print.
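
    As a rough illustration of how a psycholinguistic dictionary can be applied to text in general, the sketch below scores a text by the share of its tokens that fall in each category's word list. The category names, word lists, and the function `category_proportions` are hypothetical and invented for this example; they are not entries or code from the Grievance Dictionary, and the abstract does not specify the scoring method actually used.

```python
import re
from collections import Counter

# Hypothetical toy word lists for illustration only.
# These are NOT the actual Grievance Dictionary categories or entries.
TOY_DICTIONARY = {
    "category_a": {"angry", "furious", "hate"},
    "category_b": {"alone", "lonely", "isolated"},
}


def category_proportions(text, dictionary):
    """Return, per category, the share of tokens in `text` found in its word list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        # Avoid division by zero for empty input.
        return {name: 0.0 for name in dictionary}
    counts = Counter(tokens)
    return {
        name: sum(counts[word] for word in words) / len(tokens)
        for name, words in dictionary.items()
    }


scores = category_proportions("I hate feeling so alone and angry", TOY_DICTIONARY)
print(scores)
```

    Normalising by token count keeps scores comparable across texts of different lengths, which is one common design choice for dictionary-based measures.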

    Online influence, offline violence: Language use on YouTube surrounding the ‘Unite the Right’ rally

    The media frequently describes the 2017 Charlottesville ‘Unite the Right’ rally as a turning point for the alt-right and white supremacist movements. Social movement theory suggests that the media attention and public discourse concerning the rally may have engendered changes in social identity performance and visibility of the alt-right, but this has yet to be empirically tested. The presence of the movement on YouTube is of particular interest, as this platform has been referred to as a breeding ground for the alt-right. The current study investigates whether there are differences in language use between 7142 alt-right and progressive YouTube channels, in addition to measuring possible changes as a result of the rally. To do so, we create structural topic models and measure bigram proportions in video transcripts, spanning approximately 2 months before and after the rally. We observe differences in topics between the two groups, with the ‘alternative influencers’, for example, discussing topics related to race and free speech to a larger extent than progressive channels. We also observe structural breakpoints in the use of bigrams at the time of the rally, suggesting there are changes in language use within the two groups as a result of the rally. While most changes relate to mentions of the rally itself, the alternative group also shows an increase in promotion of their YouTube channels. In light of social movement theory, we argue that language use on YouTube shows that the Charlottesville rally indeed triggered changes in social identity performance and visibility of the alt-right.
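
    To make the notion of a bigram proportion concrete, the sketch below computes, for a set of texts, each bigram's share of all bigrams, and compares two periods. The helper `bigram_proportions` and the example sentences are invented for illustration and are not the study's data or code; the sketch also stops short of the structural breakpoint analysis mentioned in the abstract.

```python
import re
from collections import Counter


def bigram_proportions(texts):
    """Map each bigram to its share of all bigrams across `texts`."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        # Pair each token with its successor to form bigrams.
        counts.update(zip(tokens, tokens[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bigram: n / total for bigram, n in counts.items()}


# Invented example transcripts, standing in for "before" and "after" periods.
before = ["free speech matters", "subscribe to the channel"]
after = ["the rally happened", "after the rally we talked about the rally"]

p_before = bigram_proportions(before)
p_after = bigram_proportions(after)

target = ("the", "rally")
print(p_before.get(target, 0.0), p_after.get(target, 0.0))
```

    Tracking such a proportion per day or week yields a time series to which a breakpoint model could then be fitted.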
